26 research outputs found

    Multilinear algebra for phylogenetic reconstruction

    Phylogenetic reconstruction aims to recover the ancestral relationships among a group of contemporary species and to represent them in a phylogenetic tree. To do so, it is useful to model evolution with a parametric statistical model. From such models one can deduce polynomial relationships among the observed probabilities, known as phylogenetic invariants. Mathematicians have recently become interested in these polynomials and have developed techniques from algebraic geometry that have already been applied in phylogenetics. Some phylogenetic reconstruction methods based on these invariants now exist. In this project we study some theoretical results on stochasticity conditions for the model parameters and analyze whether they provide new information for these reconstruction methods. We implement the conditions and compare the results with those of the reconstruction method Erik+2. Finally, we propose a new reconstruction method based on the same ideas but with a different implementation, which gives very good results on simulated data.

    Phylogenetics and rank of matrices

    Phylogenetics is the study of the evolutionary relationships among a group of species, usually inferred from the DNA sequences of a set of living species. Thanks to the model developed by Darwin based on natural selection, we can construct phylogenetic trees that relate a set of contemporary species, and phylogenetics tries to reconstruct these trees. To do so, it is necessary to model evolution with a parametric statistical model. From such models one can deduce polynomial relationships between the parameters of the model, known as phylogenetic invariants. The main goal of this work is to understand the relationship between phylogenetics and these algebraic techniques, and to prove that the rank of certain matrices yields phylogenetic invariants that are useful for tree reconstruction.
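
    The rank idea in this abstract can be illustrated with a small sketch. It is not taken from the work itself: it assumes a toy two-state general Markov model on the quartet 12|34, assumes that the "certain matrices" are the usual flattenings of the joint leaf distribution along each split, and uses arbitrary hypothetical parameter values (`pi`, `M1` to `M4`, `Me`). Under these assumptions the flattening along the true split has rank at most 2, while the other two flattenings have full rank 4.

```python
from fractions import Fraction as F
from itertools import product

def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def stochastic(p, q):
    """2x2 row-stochastic transition matrix (hypothetical parameters)."""
    return [[1 - p, p], [q, 1 - q]]

# Assumed toy parameters: root distribution at internal node a, one
# transition matrix per leaf edge, and Me for the internal edge a -> b.
pi = [F(2, 5), F(3, 5)]
M1, M2 = stochastic(F(1, 10), F(1, 5)), stochastic(F(3, 10), F(1, 10))
M3, M4 = stochastic(F(1, 5), F(1, 4)), stochastic(F(1, 8), F(3, 10))
Me = stochastic(F(1, 5), F(1, 10))

def joint(x1, x2, x3, x4):
    """Probability of observing states (x1, x2, x3, x4) at the four leaves."""
    return sum(pi[a] * Me[a][b] * M1[a][x1] * M2[a][x2] * M3[b][x3] * M4[b][x4]
               for a in (0, 1) for b in (0, 1))

def flattening(row_leaves, col_leaves):
    """4x4 matrix: rows indexed by states of row_leaves, columns by col_leaves."""
    states = list(product((0, 1), repeat=2))
    mat = []
    for rs in states:
        row = []
        for cs in states:
            x = [0] * 4
            x[row_leaves[0]], x[row_leaves[1]] = rs
            x[col_leaves[0]], x[col_leaves[1]] = cs
            row.append(joint(*x))
        mat.append(row)
    return mat

splits = {"12|34": ((0, 1), (2, 3)), "13|24": ((0, 2), (1, 3)), "14|23": ((0, 3), (1, 2))}
ranks = {name: rank(flattening(*leaves)) for name, leaves in splits.items()}
print(ranks)
```

    With exact rational arithmetic the rank drop is visible without any numerical tolerance; on empirical frequencies one would instead have to measure how close each flattening is to low rank.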

    SAQ: semi-algebraic quartet reconstruction method

    We present the phylogenetic quartet reconstruction method SAQ (Semi-Algebraic Quartet reconstruction). SAQ is consistent with the most general Markov model of nucleotide substitution and, in particular, allows for rate heterogeneity across lineages. Based on the algebraic and semi-algebraic description of the distributions that arise from the general Markov model on a quartet, the method outputs normalized weights for the three trivalent quartets (which can be used as input to quartet-based methods). We show that SAQ is a highly competitive method that outperforms most of the well-known reconstruction methods on data simulated under the general Markov model on 4-taxon trees. Moreover, it also performs well on data that violate the underlying assumptions.

    The inertia of the symmetric approximation for low-rank matrices

    © 2017 Informa UK Limited, trading as Taylor & Francis Group. In many areas of applied linear algebra, it is necessary to work with matrix approximations. A common situation arises when a matrix obtained from experimental or simulated data needs to be approximated by a matrix that lies in a corresponding statistical model and satisfies some specific properties. In this short note, we focus on symmetric and positive-semidefinite approximations, and we show that the positive and negative indices of inertia of the symmetric approximation and the rank of the positive-semidefinite approximation are always bounded from above by the rank of the original matrix. Peer reviewed. Postprint (published version).
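
    The bound stated in this abstract can be checked on one hand-picked example. The sketch below is only an illustration, not the note's proof: it assumes that "symmetric approximation" means the symmetric part (A + Aᵀ)/2, builds a hypothetical rank-1 matrix A = u vᵀ, and counts eigenvalue signs exactly by applying Descartes' rule of signs to the characteristic polynomial (the rule is exact here because a symmetric matrix has only real eigenvalues). The vectors `u` and `v` are arbitrary choices.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(S):
    """Coefficients c[0..n] of det(x*I - S), via the Faddeev-LeVerrier recursion."""
    n = len(S)
    c = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        M = matmul(S, M)
        for i in range(n):
            M[i][i] += c[n - k + 1]
        SM = matmul(S, M)
        c[n - k] = -sum(SM[i][i] for i in range(n)) / k
    return c

def sign_changes(coeffs):
    signs = [1 if x > 0 else -1 for x in coeffs if x != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def inertia(S):
    """(positive, negative, zero) eigenvalue counts of a symmetric rational matrix."""
    c = char_poly(S)
    pos = sign_changes(c)                                        # roots > 0
    neg = sign_changes([x * (-1) ** k for k, x in enumerate(c)])  # roots < 0
    zero = next(k for k, x in enumerate(c) if x != 0)            # multiplicity of 0
    return pos, neg, zero

# Hypothetical rank-1 matrix A = u v^T (u and v are arbitrary choices).
u = [1, 2, 3, 1]
v = [4, -1, 2, 0]
rank_A = 1
A = [[Fraction(ui * vj) for vj in v] for ui in u]

# Assumed definition of the symmetric approximation: the symmetric part of A.
S = [[(A[i][j] + A[j][i]) / 2 for j in range(4)] for i in range(4)]

pos, neg, zero = inertia(S)
print(pos, neg, zero)
```

    Here the symmetric part has rank 2, larger than the rank of A, yet each index of inertia is at most 1, in line with the stated bound. Since the nearest positive-semidefinite matrix in the Frobenius norm keeps only the positive eigenvalues of the symmetric part, its rank equals the positive index.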

    SAQ: semi-algebraic quartet reconstruction

    © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. We present the phylogenetic quartet reconstruction method SAQ (Semi-Algebraic Quartet reconstruction). SAQ is consistent with the most general Markov model of nucleotide substitution and, in particular, it allows for rate heterogeneity across lineages. Based on the algebraic and semi-algebraic description of distributions that arise from the general Markov model on a quartet, the method outputs normalized weights for the three trivalent quartets (which can be used as input of quartet-based methods). We show that SAQ is a highly competitive method that outperforms most of the well known reconstruction methods on data simulated under the general Markov model on 4-taxon trees. Moreover, it also achieves a high performance on data that violates the underlying assumptions. The authors were partially supported by Spanish government Secretaría de Estado de Investigación, Desarrollo e Innovación [MTM2015-69135-P (MINECO/FEDER)] and [PID2019-103849GB-I00 (MINECO)]; Generalitat de Catalunya [2014 SGR-634]. M. Garrote-López was also funded by Spanish government, Ministerio de Economía y Competitividad research project Maria de Maeztu [MDM-2014-0445]. Peer reviewed. Postprint (author's final draft).

    Designing weights for quartet-based methods when data are heterogeneous across lineages

    Homogeneity across lineages is a general assumption in phylogenetics according to which nucleotide substitution rates are common to all lineages. Many phylogenetic methods relax this hypothesis but keep a simple enough model to make the process of sequence evolution more tractable. On the other hand, dealing successfully with the general case (heterogeneity of rates across lineages) is one of the key features of phylogenetic reconstruction methods based on algebraic tools. The goal of this paper is twofold. First, we present a new weighting system for quartets (ASAQ) based on algebraic and semi-algebraic tools, thus especially indicated to deal with data evolving under heterogeneous rates. This method combines the weights of two previous methods by means of a test based on the positivity of the branch lengths estimated with the paralinear distance. ASAQ is statistically consistent when applied to data generated under the general Markov model, considers rate and base composition heterogeneity among lineages and does not assume stationarity nor time-reversibility. Second, we test and compare the performance of several quartet-based methods for phylogenetic tree reconstruction (namely QFM, wQFM, quartet puzzling, weight optimization and Willson’s method) in combination with several systems of weights, including ASAQ weights and other weights based on algebraic and semi-algebraic methods or on the paralinear distance. These tests are applied to both simulated and real data and support weight optimization with ASAQ weights as a reliable and successful reconstruction method that improves upon the accuracy of global methods (such as neighbor-joining or maximum likelihood) in the presence of long branches or on mixtures of distributions on trees. We would like to thank the reviewers of the paper for important contributions that improved the final version of the manuscript. MC, JFS and MGL were partially supported by Spanish State Research Agency grant PID2019-103849GB-I00. MC and JFS were also supported by AEI through the Severo Ochoa and María de Maeztu Program for Centers and Units of Excellence in R&D (project CEX2020-001084-M) and by the AGAUR project 2021 SGR 00603 Geometry of Manifolds and Applications, GEOMVAP. Peer reviewed. Postprint (published version).

    Risk factors associated with adverse fetal outcomes in pregnancies affected by Coronavirus disease 2019 (COVID-19): a secondary analysis of the WAPM study on COVID-19.

    Objectives: To evaluate the strength of association between maternal and pregnancy characteristics and the risk of adverse perinatal outcomes in pregnancies with laboratory-confirmed COVID-19. Methods: Secondary analysis of a multinational cohort study on all consecutive pregnant women with laboratory-confirmed COVID-19 from February 1, 2020 to April 30, 2020 from 73 centers in 22 different countries. A confirmed case of COVID-19 was defined as a positive result on real-time reverse-transcriptase polymerase chain reaction (RT-PCR) assay of nasal and pharyngeal swab specimens. The primary outcome was a composite adverse fetal outcome, defined as the presence of either abortion (pregnancy loss before 22 weeks of gestation), stillbirth (intrauterine fetal death after 22 weeks of gestation), neonatal death (death of a live-born infant within the first 28 days of life), or perinatal death (either stillbirth or neonatal death). Logistic regression analysis was performed to evaluate parameters independently associated with the primary outcome. Logistic regression was reported as odds ratio (OR) with 95% confidence interval (CI). Results: Mean gestational age at diagnosis was 30.6 ± 9.5 weeks, with 8.0% of women being diagnosed in the first, 22.2% in the second and 69.8% in the third trimester of pregnancy. There were six miscarriages (2.3%), six intrauterine deaths (IUD) (2.3%) and five (2.0%) neonatal deaths, with an overall rate of perinatal death of 4.2% (11/265), resulting in 17 cases experiencing and 226 not experiencing composite adverse fetal outcome. Neither stillbirths nor neonatal deaths had congenital anomalies found at antenatal or postnatal evaluation. Furthermore, none of the cases experiencing IUD had signs of impending demise at arterial or venous Doppler. Neonatal deaths were all considered prematurity-related adverse events. Of the 250 live-born neonates, one (0.4%) was found positive at RT-PCR pharyngeal swabs performed after delivery. The mother tested positive during the third trimester of pregnancy. The newborn was asymptomatic and had a negative RT-PCR test after 14 days of life. At logistic regression analysis, gestational age at diagnosis (OR: 0.85, 95% CI 0.8-0.9 per week increase; p... Peer reviewed.